Auditing has been defined as a process for reducing the level of information risk associated with a set of financial 
statements to an acceptable level [Robertson and Lowers, 1999]. It is assumed that the ultimate goal of the audit is 
to communicate, through an audit report, an opinion on the quality and reliability of a client's financial statements 
[AICPA, AU 110, Statement on Auditing Standards No. 1]. The audit report serves to reduce information risk by 
communicating to users that the financial statements are (are not) reliable and credible, and contain no (contain) 
materially false or misleading information [Elliott and Jacobson, 1998; Carmichael, 1999]. 

Applied at the planning stage of an audit, analytical procedures have the potential to pinpoint areas of risk and 
influence the nature of an entire audit engagement. The goal of such preliminary analytical procedures is to identify 
and focus attention on high risk areas and to design audit programs that are appropriate for the level of risk 
indicated in the financial statements and related disclosures. These procedures result in the creation of multiple 
signals that are difficult to combine and process into a single composite risk profile. Consequently, an auditor may 
underestimate the risk associated with a particular engagement and design an ineffective audit program. The net 
effect could be the issuance of a clean audit opinion when a different type of audit report is warranted. 

A composite indicator of the initial information risk associated with the assertions management makes in financial 
statements can be viewed in terms of the type of audit opinion that is signaled a priori given the diverse information 
and signals generated by preliminary analytical procedures [Shank and Murdock, 1978]. A standard unqualified 
audit report, while not providing complete assurance that financial statements are free of errors and irregularities, 
signals that the information risk associated with the financial statements is immaterial. In contrast, a qualified audit 
report signals a material level of information risk in a client's financial statements. 

Using the audit report as a surrogate, this paper explores the effectiveness of neural networks (NNs) in assessing 
preliminary information risk. The primary research question examined is whether NNs can accurately predict 
information risk based on a set of company and industry financial ratios that have been used in prior studies to 
explore factors associated with various types of audit opinions. The paper sheds light on the potential use of neural 
networks as a preliminary risk assessment tool during the planning stage of an audit and extends prior research in 
the area. 

NNs, which are classifiers by nature, offer the capacity to simultaneously consider multiple types of evidence and 
can assist auditors in assessing risks and making judgments [Coakley and Brown, 1993; Lenard, Alam, and Meday, 
1995; Davis, Massey, and Lowell, 1997; Green and Choi, 1997; Fanning and Cogger, 1998; Calderon and Green, 
1999]. Inspired by studies of the brain and nervous system, a neural network (NN) is a model that uses complex 
algorithms to evaluate many pieces of information simultaneously in making classifications or predictions. 

The basic building block of an NN is a simulated neuron or node (see Figure 1). The neuron depicted in Figure 1 
(neuron j) receives inputs ($x_i$) from other nodes and multiplies each input by its synaptic weight, $w_{ij}$. The resulting 
products are summed within the neuron to produce an activation, $u_j = \sum_i w_{ij} x_i$. The 
activation is transformed using a transfer function, $S(u_j)$, to produce the node's output. The S-shaped 
sigmoidal transfer function, $S(u_j) = 1/(1 + e^{-u_j})$, where $u_j$ is the activation, is often used. As shown in 
Figure 2, neurons in a network are grouped into layers depending on their connection(s) to the external 
environment. While all NNs have a single input layer and a single output layer, a network structure may contain one or several 
hidden layers that enable the network to model complex functions. 
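
The computation performed by a single neuron can be illustrated with a few lines of Python. This is a minimal sketch of the weighted sum and sigmoidal transfer function described above; the function names and the input and weight values are illustrative assumptions and are not taken from any of the studies cited.

```python
import math


def sigmoid(u: float) -> float:
    """S-shaped transfer function S(u) = 1 / (1 + e^(-u))."""
    return 1.0 / (1.0 + math.exp(-u))


def neuron_output(inputs: list[float], weights: list[float]) -> float:
    """Output of neuron j: weighted sum of its inputs passed through the sigmoid."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    # Activation u_j = sum over i of w_ij * x_i
    activation = sum(w * x for w, x in zip(weights, inputs))
    return sigmoid(activation)


# Illustrative values only (hypothetical, not from the paper).
example_inputs = [0.5, -1.0, 2.0]
example_weights = [0.4, 0.3, 0.1]
print(neuron_output(example_inputs, example_weights))
```

Because the sigmoid maps any activation into the interval (0, 1), the output of a node can be read as a graded signal rather than a hard yes/no decision.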

NNs produce predictions or classifications through a supervised or unsupervised learning paradigm [Gurney, 1997]. 
In supervised learning, the data set includes known input (analogous to independent variables in regression) and 
known output (analogous to dependent variables in regression) and the network "learns" the patterns of input 
associated with known output. Unsupervised learning does not utilize known outcomes and has not been applied in 
published neural network studies in the audit domain. Supervised training seeks to minimize the difference 
between the output predicted by the network and the actual output. The algorithms used often result in a network 
that predicts the actual output with minimal error. However, the network is effective only if it can generalize and 
make accurate predictions from previously unseen input [Gurney, 1997]. 
FIGURE 1: 
FIGURE 2: 



NNs provide several advantages over traditional advanced statistical techniques such as discriminant analysis and 
logistic regression. It is well known, for example, that unlike traditional statistical techniques, NNs are non-linear 
and do not require any a priori assumptions about the distribution properties of the underlying data. NNs learn from 
the cases analyzed by constructing an input-output mapping, which is analogous to the processes used in non- 
parametric statistical methods [Haykin, 1994]. NNs learn the patterns that are evident in a particular problem and 
create a knowledge base for prediction or classification. Knowledge is represented in an NN by its structure and 
activation state, and the network can readily adapt to minor changes in the underlying situation [Haykin, 1994]. 

Academics and business professionals have used NNs for a variety of applications that require classifications and 
predictions based on large quantities of data and complex relationships that do not conform well to explicit logical 
rules and algorithms [Wong, Bodnovich, and Selvi, 1997]. Prior research in the auditing domain has examined the 
relative performance of NN models in predicting the auditor's going concern opinion [Lenard et al., 1995; Coats and 
Fant, 1993; Hansen, McDonald, and Stice, 1992], bankruptcy prediction [Barniv, Agarwal, and Leach, 1997; Bell, 
1997; Yang, Platt, and Platt, 1999; Altman, Marco, and Varetto, 1994; Zurada, Foster, Ward, and Baker, 1999], 
control risk assessment [Davis, Massey, and Lowell, 1997], and identification of errors and fraud in financial 
statements [Fanning, Cogger, and Srivasta, 1995; Fanning and Cogger, 1998; Green and Choi, 1997; Coakley and 
Brown, 1993]. 

As a group, these studies suggest that NN models may be effective in preliminary audit risk assessment. However, 
they report conflicting results, making it difficult to conclude that NNs applied to the audit risk assessment domain 
are superior to sophisticated statistical techniques such as discriminant analysis. In addition, all three studies that 
examined going concern audit opinions [Lenard et al., 1995; Coats and Fant, 1993; Hansen et al., 1992] use data 
that precede Statement on Auditing Standards No. 58: Reports on Audited Financial Statements [AICPA, 1988] 
and Statement on Auditing Standards No. 59: The Auditor's Consideration of an Entity's Ability to Continue as a 
Going Concern [AICPA, 1989]. These standards changed both the audit report and the guidelines on 
the time horizon for considering going concern uncertainties. 

The potential use of NN models as a tool in preliminary audit risk assessment is examined in this paper. The audit 
opinion is used as a surrogate for a composite risk indicator [Shank and Murdock, 1978]. In this paper, the only 
research question examined is whether NN models are effective in assessing information risk. The remainder of 
this paper is organized into five separate sections: (1) Background; (2) Method; (3) Results; (4) Limitations; and (5) 
Discussion and Conclusion. 

BACKGROUND 

Within the auditing domain, NN research has focused on predicting going concern risk and business failure [Lenard 
et al., 1995; Coats and Fant, 1993; Hansen et al., 1992; O'Leary, 1998], control risk assessment [Davis et al., 
1997], and assessment of the risk of errors and fraud in financial reports [Fanning et al., 1995; Cogger and 
Fanning, 1997; Green and Choi, 1997; Fanning and Cogger, 1998]. Most studies compare the performance of NNs 
with models such as logistic regression, probit, and discriminant analysis. These studies are reviewed in this 
section. 



Going Concern Audit Opinion 

Motivated by the ability of NNs to handle data that would violate the multivariate normality assumption and other 
restrictive assumptions of multiple discriminant analysis, Coats and Fant [1993] used NNs to model the going 
concern audit opinion. Their NN model, which used COMPUSTAT data for the period 1970 to 1989, correctly 
predicted going concern opinions at least 80 percent of the time over a lead time of up to four years. They conclude 
that NNs perform better than multiple discriminant analysis in identifying firms that eventually will receive a going 
concern audit opinion. 

Lenard et al. [1995] compared a neural network model with a logit model in predicting which firms would receive a 
modified audit report for going concern uncertainty. Based on an analysis of 
a sample of going-concern reports that predate SAS 58 [AICPA, 1988] and SAS 59 [AICPA, 1989], the authors 
report that the neural network model had an accuracy level of 95 percent, which was significantly higher than the 
accuracy of their logit model. Lenard et al. [1995], therefore, recommended their neural network model as a robust 
alternative model for supporting auditors' assessment of going concern uncertainty. 

The findings of Coats and Fant [1993] and Lenard et al. [1995] appear to contradict the results of an earlier study 
by Hansen et al. [1992], which found that NN models perform no better than other advanced statistical models in 
predicting the auditor's going concern opinion. Hansen et al. [1992] examined 40 companies that received a going 
concern audit opinion and 40 companies that exhibited financial distress and did not receive a going concern 
qualification. They compared the performance of a generalized qualitative response model, a decision tree 
inductive model (ID3), and a neural network model in predicting the auditor's going concern opinion. The NN model 
correctly classified 78 percent of the firms included in the researchers' holdout samples. The generalized qualitative 
response model performed slightly better (80.5 percent accuracy). The inductive classification model (75 percent 
accuracy) was marginally less accurate than the NN model. 

An interesting feature of the Hansen et al. [1992] study is that the authors generated 30 random samples of 40 
firms as training sets and used the remaining 40 firms as testing sets. This allowed them to replicate the 
classification exercise 30 times and compute an average classification accuracy and a corresponding standard 
deviation. The results indicate that the NN model had the lowest standard deviation. This implies that the NN model 
predicted with more consistency, albeit less accurately, than the other advanced statistical models used in the 
study. 
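
The repeated-sampling design described above can be sketched as follows. This is a minimal illustration that assumes a generic classifier supplied through two hypothetical callables (`fit` and `predict`); the toy midpoint classifier and the synthetic data are stand-ins for demonstration only and do not reproduce the models or data of Hansen et al. [1992].

```python
import random
import statistics


def repeated_holdout(cases, fit, predict, n_repeats=30, train_size=40, seed=0):
    """Mean and standard deviation of holdout accuracy over repeated random splits.

    cases   : list of (feature, label) pairs
    fit     : hypothetical callable, fit(train_cases) -> model
    predict : hypothetical callable, predict(model, feature) -> label
    """
    rng = random.Random(seed)
    accuracies = []
    for _ in range(n_repeats):
        shuffled = cases[:]
        rng.shuffle(shuffled)
        train, test = shuffled[:train_size], shuffled[train_size:]
        model = fit(train)
        correct = sum(1 for x, y in test if predict(model, x) == y)
        accuracies.append(correct / len(test))
    return statistics.mean(accuracies), statistics.stdev(accuracies)


def fit_midpoint(train):
    """Toy stand-in classifier: threshold halfway between the two class means."""
    ones = [x for x, y in train if y == 1]
    zeros = [x for x, y in train if y == 0]
    mean1 = statistics.mean(ones) if ones else 0.0
    mean0 = statistics.mean(zeros) if zeros else 0.0
    return (mean0 + mean1) / 2.0


def predict_midpoint(threshold, x):
    return 1 if x > threshold else 0


# Synthetic data: 40 cases per class, one feature each (illustrative only).
data_rng = random.Random(1)
toy_cases = [(data_rng.gauss(1.0, 1.0), 1) for _ in range(40)]
toy_cases += [(data_rng.gauss(-1.0, 1.0), 0) for _ in range(40)]

mean_acc, sd_acc = repeated_holdout(toy_cases, fit_midpoint, predict_midpoint)
print(f"mean accuracy = {mean_acc:.3f}, standard deviation = {sd_acc:.3f}")
```

Reporting the standard deviation alongside the mean is what allows a statement about the consistency of a model, not only about its average accuracy.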

Bankruptcy Prediction 

A number of studies have examined the performance of NNs in predicting bankruptcy. Table 1 presents a list of 
published studies and a brief summary of their findings. Although it has been observed that NNs perform as well as 
or outperform other methods that have been used to predict bankruptcy [O'Leary, 1998], the findings reported in 
Table 1 suggest that it may not be appropriate to conclude that NNs are superior to other advanced statistical 
models in predicting bankruptcy. NN models often memorize patterns in a researcher's training sample, and may 
not perform as well on the testing sample [Gurney, 1997]. This may have been the case in the Barniv et al. [1997] 
study, which reported strong results for the training sample and poor results for the testing sample. Bell [1997] 
reported that NNs did not perform as well as a logit model. Similarly, Etheridge and Sriram [1997] reported that a 
naive model performed better than NNs as the bankruptcy prediction horizon extended beyond one year. In a 
recent study, Yang et al. [1999] found that discriminant analysis more accurately predicted the status of bankrupt oil 
and gas companies than either a probabilistic neural network model or a back-propagation neural network model. 
TABLE 1 



Despite poor findings reported in some prior research, there are indications that NNs may be effective in predicting 
financial distress. In an intensive study, involving 213 distressed companies and 213 healthy companies, Altman et 
al. [1994] concluded that NNs have significant capacities for predicting the financial health of companies. Results 
reported for their NN model were very close or superior to results obtained through discriminant analysis. 

The study by Zurada et al. [1999], which may explain some of the conflicting results of prior studies, argued that 
NNs are not superior to logit when modeling bankruptcy with a traditional dichotomous response variable. However, 
NN models are superior to logit models when an ordinal scale (0 = healthy, 1 = dividend cut, 2 = loan default or 
favorable debt accommodation, 3 = bankruptcy) is used to represent financial distress. Yet an examination of these 
authors' results indicates that much of the reported superiority of their NN model resided in its accuracy in classifying 
healthy companies. In all cases, the NN model performed worse than the regression model in classifying bankrupt 
firms. 

Notwithstanding issues related to the validity of the claim that their multi-category response variable is ordinal, the 
authors address two important methodological issues. The first issue has to do with whether the method used to 
represent the output (response) variable in financial distress prediction has a significant impact on prediction 
accuracy. This is an unresolved issue that merits further investigation using a more refined method to represent the 
output variable. The second issue is the use of an unbalanced sample of distressed and healthy companies in 
training and testing the neural network model. This intuitive, albeit seldom used, sampling approach is often 
recommended as a means of recognizing the true frequency distribution of the samples of distressed and healthy 
companies encountered in business [Jain and Nag, 1997]. 

Control Risk Assessment 

Auditors evaluate a considerable amount of data when assessing the effectiveness of an internal control system. They 
process large amounts of data by using a combination of auditing guidelines and heuristics gained through audit 
experience to judge the amount of testing needed to reduce audit risk. Davis et al. [1997] describe an innovative 
application of a hybrid intelligent system that integrated an expert system and a neural network model to formulate 
a preliminary control risk assessment. The expert system was used to provide a user interface and to represent, 
through a series of logical rules, the internal control structure information and relationships. Preprocessed data from 
the expert system was fed to the NN, which was used to model the more complex relationships in the auditors' 
control risk assessment. 

Training and testing data came from an experiment that involved 64 auditors from Grant Thornton. The experiment 
required each auditor to make a control risk assessment for a subset of the revenue cycle (sales) of a small, 
medium, or large company. Auditors identified their respective cue sets and indicated whether or not they expected 
to rely on each cue. In addition, auditors rated their control risk assessment for each case on an interval scale 
ranging from zero to 100. The experiment yielded a total of 64 observations, 210 input variables, and one output 
variable. Thirty-two observations were used to train the network and the remaining 32 were used to test the trained 
network. Though their small sample size (64 cases) relative to the large number of input variables (240) may 
jeopardize the validity of the network [Jain and Nag, 1997], the authors report that the model was able to classify 
the auditors' control risk assessment in the testing sample with a 78 percent accuracy rate. 

Although Davis et al. [1997] focus on a narrow subset of the control structure and use experimental data that do not 
fully capture the complexities of an audit, their research demonstrates that NNs can be coupled with expert systems 
technology to model control risk assessment. 

Unlike traditional expert systems where knowledge is represented in the form of rules, NNs learn patterns and 
relationships in complex data and generate their own rules through an iterative learning process. The two 
techniques complement each other in certain judgment tasks where some degree of preliminary data processing 
that conforms well to logical rules is needed prior to an assessment of complex relationships that involves a large 
number of quantifiable variables. 

Errors And Fraud 

A small number of studies have examined the effectiveness of NN models in detecting financial statement errors 
and fraud. These studies suggest that NNs may produce lower Type I and Type II error rates than simple analytical 
procedures [Green and Choi, 1997]. However, results reported by Fanning et al. [1995], Cogger and Fanning 
[1997], Fanning and Cogger [1998], and Coakley and Brown [1993] cast doubt as to whether NN models perform 
significantly better than sophisticated statistical procedures in detecting errors and fraud. 

Fanning et al. [1995] compared the performance of a logit model with the performance of two neural network 
architectures in classifying companies into fraud and non-fraud categories. The authors used a total of 24 red flags 
from the Loebbecke and Willingham [1988] fraud risk assessment model, 77 fraud cases, and 305 non-fraud cases. 
Applied to a learning sample of 37 fraud and 113 non-fraud companies, the Type I error rates for the logit model 
and their two neural network models were nine percent, nine percent, and eight percent, respectively. Similarly, 
their Type II error rates were 30 percent, 25 percent, and 30 percent for the logit model and the two NN models, 
respectively. Applied to the testing sample, the NN models had a four percent Type I error rate and a 70 percent Type 
II error rate. In contrast, the logit model had a nine percent Type I error rate and a Type II error rate of 25 percent 
when applied to a hold-out sample. 

Cogger and Fanning [1997] used an adaptive logic neural network model to classify 204 companies (150 in the 
training sample and 54 in the test sample) into fraud and non-fraud categories. The sample included an equal 
number of fraudulent and non-fraudulent companies. A total of 21 financial and nonfinancial variables were used in 
their study. The training sample had a Type I error rate of 13 percent and a Type II error rate of 11 percent, 
whereas the holdout sample had a Type I error rate of 33 percent and a Type II error rate of 48 percent. Without 
providing specific details, the authors observed that their results may be superior to those of linear and quadratic 
discriminant analysis, which they claimed had difficulty achieving better than 50 percent accuracy. 

Using the same sample as Cogger and Fanning [1997], Fanning and Cogger [1998] compared the performance of 
logistic regression, linear discriminant analysis, quadratic discriminant analysis, and a neural network model. The 
overall prediction accuracy of the NN model on both the training and testing samples was significantly better than 
the performance of the other models. Based on its performance on the testing sample, the NN model had an overall 
prediction accuracy rate of 63 percent (75 percent based on the training sample), compared with an average overall 
prediction accuracy of approximately 50 percent (70 percent based on the training sample) for the other models. 

However, when these results are considered in terms of Type I and Type II error rates, it becomes evident that the 
NN model performed marginally better than other advanced techniques in its ability to correctly signal non-fraud 
cases, but performed no better than the other models in terms of predicting fraud cases. Type I and Type II error 
rates for the NN model were 20 percent and 31 percent, respectively, based on the training sample and 41 percent 
and 34 percent, respectively, based on the testing sample. In contrast, the average Type I and Type II error rates 
for the other models tested were 35.33 percent and 24 percent, respectively, based on the training sample and 
69.33 percent and 29.33 percent, respectively, based on the testing sample. 
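
The overall accuracy figures and the error rates reported for Fanning and Cogger [1998] can be reconciled with a short calculation. The sketch below assumes that the first error rate applies to non-fraud cases signaled as fraud, that the second applies to fraud cases signaled as non-fraud, and that fraud and non-fraud cases are equally represented in each sample. The function name and the `share_non_fraud` parameter are illustrative and do not come from the studies reviewed.

```python
def overall_accuracy(type1_rate: float, type2_rate: float,
                     share_non_fraud: float = 0.5) -> float:
    """Overall accuracy implied by two class-specific error rates.

    type1_rate      : error rate on non-fraud cases (assumed definition)
    type2_rate      : error rate on fraud cases (assumed definition)
    share_non_fraud : proportion of non-fraud cases in the sample
                      (0.5 assumes a balanced sample)
    """
    weighted_error = (share_non_fraud * type1_rate
                      + (1.0 - share_non_fraud) * type2_rate)
    return 1.0 - weighted_error


# Error rates as reported in the text above.
nn_train = overall_accuracy(0.20, 0.31)
nn_test = overall_accuracy(0.41, 0.34)
others_train = overall_accuracy(0.3533, 0.24)
others_test = overall_accuracy(0.6933, 0.2933)

for label, value in [("NN, training", nn_train), ("NN, testing", nn_test),
                     ("other models, training", others_train),
                     ("other models, testing", others_test)]:
    print(f"{label}: {100 * value:.1f} percent")
```

Under these assumptions the NN error rates imply accuracy of 74.5 percent (training) and 62.5 percent (testing), and the averages for the other models imply roughly 70.3 and 50.7 percent. These values are in line with the 75, 63, 70, and approximately 50 percent figures reported, which suggests that most of the NN model's advantage comes from its lower error rate on non-fraud cases.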

Green and Choi [1997] used actual data from a sample of 86 companies with reported frauds in their sales and 
receivable applications and a matched sample of 86 companies with no reported frauds. Inputs in the NN model 
included net sales, accounts receivable, allowance for doubtful accounts, and five related financial ratios. The best 
performing NN model based on all ratios and account balances had a Type I error rate of 15.09 percent and a 
Type II error rate of 21.95 percent. When using only ratio data, the Type I and Type II error rates were lowered to 
12.24 percent and 6.52 percent, respectively. These error rates are substantially lower than those reported in 
Fanning and Cogger [1998] and in prior studies, such as Green and Calderon [1995], that examined the 
effectiveness of simple analytical procedures in signaling financial statement fraud. 

The best performing models in Green and Calderon [1995] produced Type I and Type II error rates of 66.35 percent 
and 31.53 percent, respectively. Fanning and Cogger [1998] reported Type I and Type II error rates of 41 percent 
and 34 percent, respectively. Green and Choi [1997] concluded that their results support future use of NNs as a 
fraud-risk assessment tool. However, they did not compare the performance of their NN model with the 
performance of a traditional sophisticated statistical method such as logistic regression or discriminant analysis. 
Although their results are encouraging, it is not known whether their NN model would indeed perform better than 
traditional statistical procedures. 

While the other studies reviewed in this section used actual field data, Coakley and Brown [1993] used a 
combination of actual and simulated data to examine whether NN models are more effective as an analytical review 
technique than other models described in the literature. The authors simulated material and non-material errors in 
seven of fifteen financial statement items used in the study, and compared the performance of NNs, regression, 
and a financial ratio approach in signaling those errors. They report that although the NN model produced the 
lowest composite error rate (sum of Type I and Type II errors), none of the three approaches performed 
significantly better than a purely random process. 

Overall, the studies that have used NNs as a tool in signaling errors and fraud have not yielded the high prediction 
accuracy rates reported in some of the bankruptcy/going concern studies that used NNs. All of the studies, except 
Green and Choi [1997], reported high error rates. Though some of the results, particularly Green and Choi [1997], 
are encouraging, further research is needed to examine whether NN models can be more effective than traditional 
statistical models in assessing the risk of errors and fraud in financial statements. 

METHOD 

This paper uses a diverse set of financial information to classify financial statements into information risk classes 
that reflect the type of audit report issued for those financial statements. Consistent with the risk reduction role of 
the audit, a dichotomous classification of the audit report is used as a surrogate for information risk [Shank and 
Murdock, 1978]. A dichotomous classification scheme is used because it represents high and low information 
risk in a relatively unambiguous manner and facilitates data collection and analysis. 

An unqualified report is classified as a low information risk signal. For purposes of this study, an unqualified audit 
report is defined as one that reflects no exceptions or explanations as to the application of accounting principles 
and financial statement disclosures. Only financial statements that received a standard unqualified audit opinion 
were grouped into the low information risk category. Financial statements accompanied by audit reports in which 
the auditors did not express an opinion or expressed a qualified/adverse opinion were classified as high risk. A 
qualified/adverse audit opinion was defined to include reports that indicate limitations on the scope of the audit, 
unsatisfactory presentation of financial statement information, or an adverse opinion regarding the financial 
statements of a company. 
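
The classification rule just described can be written as a small lookup. This is a sketch only: the opinion labels used here ("unqualified", "qualified", "adverse", "disclaimer") are hypothetical strings chosen for illustration, since the coding of audit opinions in the underlying data is not specified in this section, and any opinion type outside the scheme is rejected rather than guessed.

```python
LOW_RISK = "low"
HIGH_RISK = "high"

# Hypothetical labels; the actual coding of opinions in the data is an assumption.
HIGH_RISK_OPINIONS = {"qualified", "adverse", "disclaimer"}


def information_risk_class(opinion: str) -> str:
    """Map an audit opinion label to the dichotomous information risk class."""
    label = opinion.strip().lower()
    if label == "unqualified":
        # Standard unqualified report: low information risk.
        return LOW_RISK
    if label in HIGH_RISK_OPINIONS:
        # Qualified or adverse opinion, or no opinion expressed: high information risk.
        return HIGH_RISK
    raise ValueError(f"opinion type not covered by the scheme: {opinion!r}")


print(information_risk_class("Unqualified"))
print(information_risk_class("disclaimer"))
```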

Variables 

Twenty-two explanatory variables were used to model the dichotomous information risk classification of financial 
statements. These variables have been used in prior studies that model the going concern opinion [Lenard et al., 
1995; Hansen et al., 1992], financial distress [Yang et al., 1999; Coats and Fant, 1993] and audit opinion decisions 
[Krishnan and Krishnan, 1996; Dopuch, Holthausen and Leftwich, 1987]. They include: 

1. Receivables to total assets (RTA) 
2. Inventory to total assets (INVTA) 
3. Net income to total assets (NITA) 
4. Current assets to total assets (CATA) 
5. Current assets to current liabilities (CACL) 
6. Cash to total assets (CHTA) 
7. Debt to total assets (DTA) 
8. Log of total assets (LogTA) 
9. Log of net sales (LogSale) 
10. Beta 
11. Auditor (Big 5 or other) 
12. Major stockholders 

In addition to the 12 variables listed above, variables 1 to 10 were benchmarked against the corresponding industry 
variable using the following formula: 

$B_{ic} = R_{ic} / R_{id}$ 

where: 

$B_{ic}$ = benchmarked variable i (i = 1, ..., 10) for each company; 
$R_{ic}$ = variable i for each company; 
$R_{id}$ = variable i for each company's industry, based on a four-digit SIC code. 

A significant deviation from industry norms is generally considered to be a red flag in analytical auditing [Green and 
Calderon, 1995]. 
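
The benchmarking step amounts to dividing each company-level variable by the corresponding industry-level value. The following sketch illustrates the calculation; how the industry value is computed (for example, as a mean or a median across firms sharing an SIC code) is not stated here, so it is treated as a given input. The function name and the numeric values are hypothetical.

```python
def benchmark_ratios(company: dict[str, float],
                     industry: dict[str, float]) -> dict[str, float]:
    """Benchmarked variables: company value divided by industry value, per variable."""
    benchmarked = {}
    for name, value in company.items():
        industry_value = industry[name]
        if industry_value == 0:
            raise ValueError(f"industry value for {name} is zero; ratio is undefined")
        benchmarked[name] = value / industry_value
    return benchmarked


# Illustrative values only (not taken from the sample).
company_values = {"RTA": 0.25, "INVTA": 0.5}
industry_values = {"RTA": 0.125, "INVTA": 0.5}

print(benchmark_ratios(company_values, industry_values))
# prints {'RTA': 2.0, 'INVTA': 1.0}
```

A benchmarked value near 1 indicates that the company is in line with its industry, while values far from 1 in either direction correspond to the deviations from industry norms that are treated as red flags.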

Asset composition variables such as RTA, INVTA, CATA, and CHTA are included because prior research on 
analytical procedures [Green and Calderon, 1995; Green and Choi, 1997] and auditor litigation exposure [St. Pierre 
and Anderson, 1984; Stice, 1991] suggests that the revenue and procurement cycles are high risk areas in auditing. 
These cycles account for a high proportion of financial statement errors and frauds investigated by the Securities 
and Exchange Commission [Hylas and Ashton, 1982; Green and Calderon, 1995] and for a substantial number of 
lawsuits involving auditors [St. Pierre and Anderson, 1984]. Stice [1991] reports that asset composition variables, 
including RTA and INVTA, are significantly associated with auditor litigation. 

DTA captures financial leverage, which Krishnan and Krishnan [1996] found to be a statistically significant factor in 
distinguishing between qualified and unqualified audit opinions. Similarly, Lenard et al. [1995] indicate that debt to 
total assets is a significant variable in distinguishing between going concern and non-going concern audit opinions. 
CHTA and CACL measure short-term liquidity and, along with DTA, serve as surrogates for financial distress. Beta 
is used as a surrogate for the market risk associated with the audit client [Krishnan and Krishnan, 1996; Dopuch et 
al., 1987]. 

It has been posited that the audit report qualification decision is a two-stage process [Krishnan and Krishnan, 
1996]. The first stage involves an assessment of the risk of a material error, financial statement fraud, or some 
other reportable condition such as a significant departure from GAAP, scope limitation, and so on. The second 
stage involves an assessment of the impact on the auditor as a result of issuing or not issuing a qualified opinion, 
including litigation risk and the risk and exposure associated with losing the client. 

Stice [1991] reported that both client-related factors and auditor-related factors affect litigation risk. Krishnan and 
Krishnan [1996] found that, other things being equal, an audit firm is more likely to issue a qualified opinion when 
litigation risk and outside ownership of the client are high, and when the client is less important in the auditor's 
engagement portfolio. The size of the audit client (sales, assets, and/or market capitalization) and the size of the 
audit firm (big-five or non-big-five) have both been explored as factors that influence the audit qualification decision 
[Shank and Murdock, 1978; Krishnan and Krishnan, 1996; Geiger, Raghunandan and Rama, 1998]. The log of total 
assets and the log of total sales are used as surrogates for the size of an audit client. The variable major 
stockholders (which represents whether the company is a subsidiary of a publicly traded company, a 
subsidiary of a company that is not publicly traded, a public company that is not traded on a major stock exchange, or 
a company that has undergone a leveraged buyout) is used as a surrogate for the nature of the company's stock 
ownership. 
Data 



Data for the study were extracted from COMPUSTAT's database of active companies. To be selected, a company 
had to have received either a clean audit opinion or an audit opinion that is consistent with the definition of a 
qualified opinion (high information risk) used in this study during fiscal years 1989 to 1997. In addition, all variables 
used in the study, or data required to compute those variables, had to be available. Companies with SIC codes 
greater than or equal to 6000 (banks and financial services) were excluded since those companies have a different 
type of asset composition from industrial companies. Companies receiving a qualified audit opinion were matched 
with companies receiving clean opinions on the basis of sales, total assets, industry, and fiscal year. Table 2 
summarizes the sample. 

Modeling 

A three-layer neural network model was developed using the advanced NN module in NeuroShell 2(R) version 4.0. 
A General Regression Neural Network (GRNN) with a logistic scale factor and the genetic adaptive calibration 
option in NeuroShell 2 were used to train the network. GRNNs are known for their ability to train quickly on sparse 
data sets and perform better than many other types of networks on a variety of problems [Specht, 1991; Ward 
Systems Group, 1996]. GRNNs generate predictions through the use of non-parametric estimators of probability 
density functions and will not converge to a local minimum error as is often the case with other NN algorithms 
[Specht, 1991]. 
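
The core of a GRNN prediction can be sketched as a kernel-weighted average of the training outputs. The code below is a simplified illustration that assumes a Gaussian kernel, Euclidean distance, and a single smoothing parameter (`sigma`) shared by all inputs. It does not reproduce the NeuroShell 2 implementation, its input scaling, or its genetic adaptive calibration, and the function name and data are hypothetical.

```python
import math


def grnn_predict(x, train_x, train_y, sigma=0.3):
    """Kernel-weighted average of training outputs (simplified GRNN estimate).

    x       : list of input values for the case to be scored
    train_x : list of training input vectors (one pattern unit per training case)
    train_y : list of known outputs (here 0 = low risk, 1 = high risk)
    sigma   : smoothing parameter; smaller values fit the training cases more tightly
    """
    weights = []
    for pattern in train_x:
        squared_distance = sum((a - b) ** 2 for a, b in zip(x, pattern))
        weights.append(math.exp(-squared_distance / (2.0 * sigma ** 2)))
    total = sum(weights)
    if total == 0.0:
        # All kernel weights underflowed; fall back to the unweighted mean output.
        return sum(train_y) / len(train_y)
    return sum(w * y for w, y in zip(weights, train_y)) / total


# Two illustrative training cases (hypothetical values).
train_inputs = [[0.0, 0.0], [1.0, 1.0]]
train_outputs = [0, 1]

print(grnn_predict([0.1, 0.0], train_inputs, train_outputs))  # close to 0
print(grnn_predict([0.5, 0.5], train_inputs, train_outputs))  # equidistant case
```

Because the estimate is computed directly from the stored training cases, there is no iterative weight search that could stall in a local minimum; the only quantity to be tuned is the smoothing parameter.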
TABLE 2 



The model contained an input layer with 22 neurons (one for each input variable), a single hidden layer containing 
378 neurons (one for each case or pattern), and an output layer containing a single neuron. No learning rate and 
momentum parameters are required in building GRNN models, but a smoothing factor is required. The smoothing 
factor is used when the network is applied to new data and determines how tightly the network matches its 
predictions to the data in the training patterns. An initial smoothing factor of 0.3 (a default in NeuroShell 2(R)) was 
applied in building the network. The genetic breeding pool was set to 50 and the auto-termination switch was set to 
off. The model was trained and tested using 261 training patterns (223 low risk and 38 high risk) and 117 testing 
patterns (99 low risk and 18 high risk). Testing patterns were selected from the original sample of 378 companies 
using a randomly generated number drawn from a Bernoulli distribution with a 30 percent probability of selection in 
the testing sample. Similar to Hansen et al. [1992], this facilitated the selection of five alternative samples for 
conducting sensitivity analysis on the neural network model. 
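
The role of the smoothing factor can be illustrated with a minimal sketch of a GRNN prediction in the form given by Specht [1991], in which each training pattern acts as one hidden neuron and the output is a kernel-weighted average of the training targets. The function name, the toy data, and the use of a Gaussian kernel with the smoothing factor as its width are assumptions made for illustration; this is not the NeuroShell 2 implementation, and the genetic adaptive calibration of the smoothing factor is not shown.

```python
import math


def grnn_predict(train_x, train_y, x, smoothing=0.3):
    """Return a kernel-weighted average of the training targets.

    Assumption: a Gaussian kernel whose width is the smoothing factor.
    Each training pattern plays the role of one hidden-layer neuron.
    """
    weights = []
    for pattern in train_x:
        sq_dist = sum((a - b) ** 2 for a, b in zip(pattern, x))
        weights.append(math.exp(-sq_dist / (2.0 * smoothing ** 2)))
    total = sum(weights)
    if total == 0.0:
        # Query is far from every pattern: fall back to the plain mean.
        return sum(train_y) / len(train_y)
    return sum(w * y for w, y in zip(weights, train_y)) / total


# Hypothetical toy patterns: target 0 = low risk, 1 = high risk.
train_x = [[0.1, 0.2], [0.2, 0.1], [0.9, 0.8], [0.8, 0.9]]
train_y = [0.0, 0.0, 1.0, 1.0]

low = grnn_predict(train_x, train_y, [0.15, 0.15])
high = grnn_predict(train_x, train_y, [0.85, 0.85])
flat = grnn_predict(train_x, train_y, [0.15, 0.15], smoothing=100.0)
```

A small smoothing factor makes the prediction follow the nearest training patterns closely, while a very large one pulls every prediction toward the overall mean of the training targets.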

The performance of the model was assessed by computing and evaluating the Type 1 and Type 2 error rates it 
generated. A Type 1 error results if a company is classified into the high information risk group when the company 
did in fact receive a clean audit opinion and, therefore, belonged to the low information risk class. A Type 2 error 
occurs when a company that is classified into the low information risk group is in fact a high information risk 
company. Both Type 1 and Type 2 errors committed during the planning phase of an audit create problems for 
auditors. On one hand, a Type 1 error committed during the planning stage of an audit causes the auditor to 
undertake more work than is necessary. On the other hand, a Type 2 error increases litigation risk and exposure 
and can jeopardize the survival of the audit firm. Results from the NN model are benchmarked against 
classifications generated by a linear discriminant analysis model based on the same training and testing samples 
used in the NN model. This is consistent with other similar studies that have used NNs [Yang et al., 1999; Kumar et 
al., 1997; Jain and Nag, 1997; Barniv et al., 1997; Etheridge and Sriram, 1997; Altman et al., 1994]. The relative 
performance of the two models is assessed by comparing the Type 1 and Type 2 error rates. 

The relative performance of the two models is also assessed by comparing the cost of misclassification for each 
model. The following model, adapted from Masters [1993] and Cheh, Weinberg and Yook [1999], is used to 
compute the cost of misclassifying the level of information risk: 

$$C = (1-q)\,p_1 c_1 + q\,p_2 c_2$$

where, 

$C$ = total cost of misclassification; 

$q$ = a priori probability that the information risk associated with a company's financial statements is high; 

$p_1$, $p_2$ = probability of a Type 1 and Type 2 error, respectively (estimated by the Type 1 and Type 2 error 
rates for the classification model); 

$c_1$, $c_2$ = the cost of Type 1 and Type 2 errors, respectively. 

The a priori probability that information risk associated with a company's financial statements is high, q, can be 
estimated from the proportion of high information risk entities in the training sample. Ideally, the training sample 
should have the same proportion of high information risk entities as the population of entities that are audited by 
external auditors. As this is not the case in this study, the estimate of q will reflect the biases inherent in the 
procedures employed to select the sample of cases examined. 

The costs of Type 1 and Type 2 errors ($c_1$ and $c_2$) are more difficult to estimate. A Type 1 error 
committed during the preliminary stages of an audit may have a significant impact on the scope and intensity of 
audit programs and may affect the efficiency of the audit. Presumably, the auditor eventually recognizes the error 
and does not issue an incorrect audit opinion. However, a Type 2 error committed during the preliminary stages of 
the audit could lead to the issuance of an incorrect audit opinion. 

It seems plausible that the probability and associated cost (including litigation costs) of issuing an incorrect audit 
opinion are higher when, during the preliminary risk assessment stage of the audit, a Type 2 error is committed than 
when a Type 1 error is committed. Thus, the expected cost of a Type 2 error should be higher than the expected 
cost of a Type 1 error [Lenard et al., 1995; Salchenberger et al., 1992]. Because the cost of a Type 1 or a Type 2 
error depends on a number of factors that vary across audit engagements, the ratio of the two costs rather than an 
absolute amount is used in computing the cost of misclassification. The cost of misclassification is computed for 
twenty ratios of Type 2 error cost to Type 1 error cost ($c_2/c_1$) based on the assumption that the expected cost of a 
Type 2 error will be higher than the expected cost of a Type 1 error. 
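
The cost computation can be sketched as follows. This is an illustration under stated assumptions, not the authors' calculation: q is taken as 38/261 (the share of high risk patterns in the training sample described earlier), the cost of a Type 1 error is normalized to 1 so that the cost of a Type 2 error equals the ratio, the twenty ratios are assumed to be the integers 1 to 20, and the error rates are the testing-sample figures quoted in the text. The function and variable names are hypothetical.

```python
def misclassification_cost(q, p1, p2, c1, c2):
    """C = (1 - q) * p1 * c1 + q * p2 * c2."""
    return (1.0 - q) * p1 * c1 + q * p2 * c2


# Assumption: prior taken from the training sample (38 high risk of 261).
q = 38 / 261

# Testing-sample (Type 1, Type 2) error rates as quoted in the text;
# a fair coin toss is assumed to err half the time in each class.
error_rates = {
    "neural network": (0.02, 0.2778),
    "discriminant analysis": (0.1771, 0.5294),
    "coin toss": (0.50, 0.50),
}


def cost_table(ratios, c1=1.0):
    """Cost of each model for each assumed ratio c2/c1 (c1 normalized)."""
    table = {}
    for ratio in ratios:
        table[ratio] = {
            name: misclassification_cost(q, p1, p2, c1, ratio * c1)
            for name, (p1, p2) in error_rates.items()
        }
    return table


costs = cost_table(range(1, 21))  # assumed: twenty integer ratios
ratio_at_parity = (
    costs[1]["discriminant analysis"] / costs[1]["neural network"]
)
```

Under these assumed inputs the discriminant analysis cost at equal error costs comes out just under four times the neural network cost, and the gap between the two widens as the ratio grows because the discriminant model has the higher Type 2 error rate.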



Univariate tests were performed to examine whether the variables used in the study are significantly different for 
the high and low information risk samples. Results are shown in Table 3. Except for receivables to total assets 
(RTA), all company variables are significantly (p-value<.10) different across the high information risk and low 
information risk samples. 

In general, high information risk firms have higher market risk (beta) and higher levels of financial leverage than low 
information risk entities. High information risk firms also have a lower proportion of total inventories and current 
assets in their asset structure, less short-term liquidity, and tend to be less profitable than low information risk 
entities. A higher proportion of low information risk firms than of high information risk firms are audited by 
big-five auditors. 

Classification 

The classification results from the NN model and the discriminant analysis model are shown in Table 4. Type 1 
error rates for the NN model are zero and two percent for the training and testing samples, respectively. In other 
words, the NN model correctly classified 100 percent of low information risk companies in the training sample and 
almost 98 percent of low information risk companies in the testing sample. 



The model misclassified 2.63 percent and 27.78 percent of the high information risk entities in the training and 
testing samples, respectively. The sum of the Type 1 and Type 2 error rates (referred to in the literature as the 
combined error rate [Loebbecke and Steinbart, 1987; Green and Calderon, 1995]) is 2.63 percent and 29.80 
percent for the training and testing samples, respectively. Given that a purely random process has a combined 
error of 100 percent [Loebbecke and Steinbart, 1987; Green and Calderon, 1995], it can be concluded that the NN 
model performed significantly better than such a process. 
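
The arithmetic behind these rates can be checked with a short sketch. The misclassification counts below are not reported in the text; they are inferred here from the stated percentages and sample sizes (for example, 5 of 18 high risk testing patterns gives 27.78 percent) and should be read as an assumption. The function name is hypothetical.

```python
def error_rates(low_total, low_wrong, high_total, high_wrong):
    """Return (Type 1, Type 2, combined) error rates as fractions.

    Type 1: low risk company classified as high risk.
    Type 2: high risk company classified as low risk.
    """
    type1 = low_wrong / low_total
    type2 = high_wrong / high_total
    return type1, type2, type1 + type2


# Sample sizes are from the text; misclassified counts are inferred.
training = error_rates(low_total=223, low_wrong=0, high_total=38, high_wrong=1)
testing = error_rates(low_total=99, low_wrong=2, high_total=18, high_wrong=5)

for label, (t1, t2, combined) in (("training", training), ("testing", testing)):
    print(f"{label}: Type 1 {t1:.2%}, Type 2 {t2:.2%}, combined {combined:.2%}")
```

Because the two rates are computed on different class totals, the combined rate can range from 0 to 200 percent, which is why a process that errs half the time in each class scores 100 percent.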

Furthermore, the NN model performed significantly better than the discriminant analysis model in classifying both 
low and high information risk entities. The discriminant analysis model correctly classified 82.29 percent of the low 
information risk entities and only 47.06 percent of the high information risk entities in the testing sample. This yielded 
a Type 1 error rate of 17.71 percent and a Type 2 error rate of 52.94 percent. The model's combined error rate of 
70.65 percent (testing sample) is significantly higher than the combined error rate generated by the NN model. 
Although the discriminant model performed fairly well in identifying low information risk entities (17.71 percent Type 
1 error rate), the model performed no better than a fair coin toss in classifying high information risk entities. 

TABLE 4 



Table 5 shows the cost of misclassification for the NN model, the discriminant analysis model, and a preliminary 
information risk analysis based only on a fair coin toss, under various assumptions about the ratio of the cost of 
Type 2 to Type 1 errors ($c_2/c_1$). In a situation where $c_2 = c_1$, the cost of 
misclassification when using the discriminant analysis model for preliminary audit risk assessment is almost four 
times the cost of using the neural network model. Similarly, a risk assessment model that is only as good as a fair 
coin toss has a misclassification cost that is approximately ten times higher than the cost associated with the 
neural network model. 

As shown in Figure 3, the absolute value of the difference between the misclassification costs increases as the ratio 
of $c_2$ to $c_1$ increases. 

To assess the stability of the NN results, five sets of training and testing samples were created from the full sample 
of 378 cases using a randomly generated Bernoulli distribution with a 30 percent probability of selection into the 
testing sample. Independently generated NN models were then created and evaluated for each of the five samples. 
The same NN architecture was used for all models. Results presented in Table 6 show that the NN model 
consistently produces combined error rates of less than 30 percent. 
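
The resampling scheme can be sketched as below. This is a minimal illustration, assuming each case is assigned independently to the testing sample with probability 0.30; the function name and the seeds are hypothetical, and the study's actual random draws are not known.

```python
import random


def bernoulli_split(n_cases, p_test=0.30, seed=0):
    """Assign each case index to the testing sample with probability p_test."""
    rng = random.Random(seed)
    train, test = [], []
    for index in range(n_cases):
        if rng.random() < p_test:
            test.append(index)
        else:
            train.append(index)
    return train, test


# Five independent training/testing partitions of the 378 cases.
splits = [bernoulli_split(378, seed=s) for s in range(5)]

for number, (train, test) in enumerate(splits, start=1):
    print(f"sample {number}: {len(train)} training, {len(test)} testing")
```

Because selection is random per case, the testing sample size varies from draw to draw around 30 percent of 378 rather than being fixed in advance.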

The average Type 1 error rate for the testing sample is under two percent, with a standard deviation of 1.085 
percent. The average Type 2 error rate is 21.722 percent, with a standard deviation of 8.214 percent. The results 
suggest that the NN model consistently identifies the information risk class of cases in the testing sample 
significantly better than either a discriminant analysis model or a process based on the toss of a fair coin. 





TABLE 5 

FIGURE 3 

TABLE 6 



LIMITATIONS 



The general limitations of NNs must be considered in assessing the results reported in this paper. By manipulating 
the number of neurons and the number of hidden layers, a neural network model can be made to learn the 
underlying patterns of practically any data set [Gurney, 1997]. However, the fact that a model learns the underlying 
pattern in a data set does not necessarily imply that the model will predict or classify effectively when confronted 
with data it has not previously seen. 



By memorizing the patterns in a training sample, a model might perform very well on the training patterns but may 
perform poorly on the testing or validation patterns [Gurney, 1997; Haykin, 1994]. Cogger and Fanning [1997] 
demonstrate this potential problem, which is often referred to as overtraining. Their NN model produced a 
combined error rate of 24 percent based on the training sample and 81 percent based on their testing sample. 
Though the risk of overtraining is not eliminated in the current study, the gap between the combined error rates for 
the testing and training samples is much smaller in the current study (average, 20.549 percent) than in Cogger and 
Fanning [1997]. 



It is essential that NN models be thoroughly tested with data that were not used in building the model before being 
applied in the field. Validation based only on the testing sample may not be sufficient. While the testing sample was 
not used directly in building the model, it was used in deciding when to stop training the network. In response to this 
concern, the NN model developed in this study was tested on multiple testing samples and performed significantly 
better than discriminant analysis in all cases. However, additional testing on new data not previously seen by the 
model is needed before it can be conclusively established that the model will perform well in the field. 



NN models do not currently allow researchers to assess the statistical significance of variables used in the model 
and it is difficult to explain the model conceptually. While models such as linear regression produce a set of 
coefficients that could be tested to draw inferences, NN models do not produce information that may be used for 
drawing inferences and assessing the significance of input variables. Therefore, no attempt is made in this study to 
assess the statistical significance of any of the variables in the NN model. Similarly, no attempt is made to express 
the output of the NN as a function of the input. 

An NN model is like a black box. The input and output are observable, but the internal processes used to link the input 
to the output are not. The classifications or predictions made by an NN model may, therefore, be difficult to explain and 
justify. Nonetheless, once an NN model is trained and tested, the model can be easily applied to make classifications 
or predictions from new data. 

DISCUSSION AND CONCLUSION 

The NN model examined in this paper performed significantly better than both a discriminant analysis model and a 
naive risk assessment model based on a fair coin toss. The NN model was superior in terms of the combined error 
rate and also in terms of the cost of misclassification. 

If risk assessment were based on a fair coin toss, the combined error rate would be 100 percent [Loebbecke and 
Steinbart, 1987]. By contrast, the NN model employed in this study had an average combined error rate of 
approximately 23 percent. The discriminant analysis model, on the other hand, had a 70 percent combined error 
rate, which is lower than the 100 percent threshold for a naive risk assessment procedure based on a fair coin toss, 
but significantly higher than the observed error rate for the NN model. Similarly, the cost of misclassification based 
on the discriminant analysis model is over three times the cost of misclassification when using the NN model. 

The results reported in this paper support the findings of recent studies such as Lenard et al. [1995] and Green and 
Choi [1997], which indicate that NN models may be more effective than other models in certain areas of audit risk 
assessment such as predicting going concern audit opinions and in identifying financial statement fraud. The 
results demonstrate that NN models are potentially powerful tools in assessing preliminary information risk. 

Implications 

There are many reasons why using an NN model in audit risk analysis might be advantageous. Unlike conventional 
models, no distributional or other statistical assumptions are required. The NN model iteratively searches for a set 
of weights that minimizes the difference between actual and predicted output. Because NN models make 
predictions or classifications by learning from every observation in a sample, non-linearity and interdependencies 
among variables do not impose a constraint. Once a network has been trained and tested, it is easy to use the 
model to classify new data. 

Preliminary analytical procedures using traditional techniques require auditors to undertake a multi-step process 
that involves developing an expectations model to predict account balances or ratios, and specifying a decision rule 
to identify deviations between actual and expected amounts or ratios that should be further investigated [Calderon 
and Green, 1994]. The results presented in this study suggest that this process may be simplified and enhanced by 
using a well-designed NN model. The NN model can generate the expectation model, apply the decision rule, and 
produce an output that tells the auditor whether an account balance or ratio is correct or incorrect, or should be 
investigated or not investigated. 

NN models can also be easily adapted or retrained to respond to changes [Haykin, 1994] in an audit environment. 
The effectiveness of the NN model may be further enhanced by combining it with an expert system [Wu, 1994; 
Osyk and Vijayaraman, 1995; Davis et al., 1997]. The expert system could identify and capture qualitative factors 
that are important in information risk assessment and process them as input into the NN model. Alternatively, the 
NN model could process quantitative data and enter its output into an expert system. 

ENDNOTES 

1. Information risk is defined as the risk that financial statements will be materially false and misleading [Robertson and 
Louwers, 1999]. 

2. This item corresponds to response category 3 in the authors' multicategory ordinal scale. 


Tony Brabazon reports that the genetic algorithm offers a new tool for developing solutions to business problems and is 
currently in use in various financial institutions. It is likely to spread from there. 

The last five years have seen some dramatic changes in the way businesses can make use of computers. 
Computers are starting to think! The speed of the computer is being allied with ideas drawn from artificial 
intelligence research to produce more 'human' computer systems which process information in a similar way to the 
human brain. The aim is to improve the problem-solving and decision-making capabilities of computer systems. 

Making computer systems more human is no easy task for many reasons, not least of which is the problem that we 
don't properly understand our own thought processes. People can make decisions based on their past experience, 
even when only fuzzy (uncertain) information is available. Until recently, this seemingly simple process was 
completely beyond the capabilities of computer systems. The last few years have seen this change. Neural 
networks (see Management Accounting, April 1995) attempt to simulate the self-teaching process of the human 
brain, whereby people can draw on past experiences in making decisions. Computerised decision-making systems 
based on these principles are spreading quickly throughout the business world, most notably in the areas of finance 
and marketing. Can biology provide any other useful ideas for the design of computer systems? 

Curiously, the concepts of Darwinian evolution provide a new springboard for the development of more human-like 
computer systems. Darwinian evolution is based on the idea of 'survival of the fittest'. The organisms which are 
best adapted to their environment survive and reproduce; the less well adapted perish. Over many generations, 
evolution should ensure that organisms are well adapted to their particular environment. It is possible to apply these 
evolutionary ideas to 'breed' computer systems which provide a solution to a particular problem. Techniques which 
adopt this approach are known as 'genetic algorithms'. The idea of introducing genetic concepts to computer 
system development was first popularised by John Holland in 1975 but it is only within the last five years that they 
have been used in business computer systems. 

So how do genetic algorithms work? While there are a vast number of ways to implement them, their basic 
workings can be simplified to four steps. Initially a set of possible solutions (perhaps a set of decision rules) is 
generated for the problem under review. Each of these 'candidate' solutions is evaluated and a subset of them 
(the best ones) is selected for breeding. These 'better' solutions (those solutions which best solve the problem at 
hand) breed by swapping (genetic) information. This swapping procedure may be as simple as combining random 
parts of each parent solution to form a child solution, although more complex breeding strategies may be 
implemented. The offspring hopefully contain most or all of the good characteristics of their parents. The child 
solutions are evaluated and the better of them breed in turn. Over many generations, the solutions will evolve to 
become better at solving the original problem. The weeding out of the poorer solutions in each generation cycle is a 
direct parallel to Darwin's idea of survival of the fittest. The solutions at each evolutionary stage can be further 
improved by introducing 'mutation' into the breeding process. Random elements of the child solution are changed 
slightly (mutated). This introduces new elements into the next generation and hopefully leads to new improved 
characteristics in the child solutions. Mutation enables the algorithm to search for the optimal solution to the 
problem even if the initial parent solutions were poor. 
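
The steps described above can be sketched in a few lines. This is a toy illustration rather than any commercial package's method: solutions are assumed to be strings of bits, the 'problem' is simply to maximise the number of 1s, and the population size, mutation rate and the habit of carrying the single best solution forward unchanged are all assumptions chosen to keep the example short.

```python
import random


def evolve(fitness, n_bits=20, pop_size=30, generations=40,
           mutation_rate=0.02, seed=1):
    """Breed bit-string solutions; return the best one and its history."""
    rng = random.Random(seed)
    # Step 1: generate an initial set of random candidate solutions.
    population = [[rng.randint(0, 1) for _ in range(n_bits)]
                  for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        # Step 2: evaluate the candidates and select the better half.
        ranked = sorted(population, key=fitness, reverse=True)
        history.append(fitness(ranked[0]))
        parents = ranked[: pop_size // 2]
        # Assumption: keep the best solution unchanged in the next generation.
        children = [ranked[0][:]]
        while len(children) < pop_size:
            # Step 3: breed by combining parts of two parent solutions.
            mother, father = rng.sample(parents, 2)
            cut = rng.randint(1, n_bits - 1)
            child = mother[:cut] + father[cut:]
            # Step 4: mutate by occasionally flipping a bit.
            child = [1 - bit if rng.random() < mutation_rate else bit
                     for bit in child]
            children.append(child)
        population = children
    best = max(population, key=fitness)
    history.append(fitness(best))
    return best, history


best, history = evolve(fitness=sum)
print("best fitness by generation:", history)
```

In a business setting the bit string would be replaced by an encoding of a set of decision rules, and the fitness function by a measure of how good the resulting decisions are.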

The above idea can be used to evolve rules for use in expert decision-making systems. Several parent sets of rules 
are initially created. While it is possible that none of these sets of rules produces good decisions, the better sets of 
rules breed and give rise to a new generation of sets of rules. Through many generations of breeding and 
mutations, the process will hopefully give rise to sets of rules which provide very good solutions to the problem at 
hand. 

What are the advantages of genetic algorithms? Due to their evolutionary nature, genetic algorithms can 'self 
teach' and can adapt well to changes in their environment. Genetic algorithms can uncover new, previously 
unknown ways to do things. One example of this occurred in the design of a jet engine for the Boeing 777. A 
research team at GE using genetic algorithms discovered a novel way to redesign the fan blades. This resulted in 
an engine design which was more fuel efficient, resulting in notable cost savings for airlines. 

Genetic algorithms are potentially useful in several areas of business. Financial institutions were among the early 
adopters of these ideas. Genetic algorithms have been used to develop rules for expert systems for vetting of loan 
applications. Research has shown that bankruptcy prediction models developed using genetic algorithms can 
outperform traditional models based on multiple discriminant analysis. Genetic algorithms can also be used to 
develop expert rules for insurance risk assessment. The resulting measure of risk can be used to price the 
insurance premium. Other developed uses for genetic algorithms include securities and currency trading. Not all of 
the applications of genetic algorithms are financial. Genetic algorithms have been used to solve scheduling 
problems. Indeed, some television networks have used genetic algorithms to assist in scheduling advertising spots 
during commercial breaks. 

Computer systems using genetic algorithms need not be developed in isolation from other artificial intelligence 
techniques. Increasingly, companies are developing 'hybrid' computerised decision-making systems which combine 
several types of artificial intelligence. It is not surprising that this should lead to better decision systems as the 
hybrid more closely resembles true human thinking than could a single type of artificial intelligence technique. By 
combining several types of artificial intelligence it is possible to incorporate the strengths of each individual 
technique. For example, genetic algorithms may be combined with neural networks to 'evolve' neural networks 
which are particularly well suited for a specific application. A number of financial institutions use such hybrid 
systems for securities and currency trading purposes. 

As knowledge of genetic algorithms, their uses and potential benefits has spread, an increasing variety of software 
to help implement them has become available. Some of the better known packages include EOS VBX by Man 
Machine Interfaces Inc, Evolver by Axcelis Inc and GeneHunter (an Excel add-in) by Ward Systems Group. 
Logivolve by Scientific Consultant Services Inc is designed to apply concepts from genetic algorithms in the 
construction of neural networks. 

Computerised decision-making systems are becoming ever more complex. Builders of such systems can use 
genetic algorithms to improve their capabilities. Applying these concepts leads to systems which are flexible, 
adaptable and capable of learning from their environment. Businesses that carefully develop and implement such 
systems stand to gain significant competitive advantage. 